Reconstructing the free energy landscape of a mechanically unfolded model protein

The equilibrium free energy landscape of an off-lattice model protein as a function of an internal
(reaction) coordinate is reconstructed from out-of-equilibrium mechanical unfolding manipulations. 
This task is accomplished via two independent methods: by employing an extended version of the 
Jarzynski equality (EJE) and the protein inherent structures (ISs). In a range of temperatures 
around the "folding transition" we find a good quantitative agreement between the free energies 
obtained via EJE and IS approaches. This indicates that the two methodologies are consistent and 
able to reproduce equilibrium properties of the examined system. Moreover, for the studied model 
the structural transitions induced by pulling can be related to thermodynamical aspects of folding.

The properties of the (free) energy landscape can heavily influence the dynamical and
thermodynamical features of a large class of systems: supercooled liquids, glasses, atomic
clusters and biomolecules [1]. In particular, the shape of the landscape plays a major role in
determining the folding properties of proteins [2]. A fruitful approach to the analysis of the
landscape relies on the identification of the local minima of the potential energy, i.e. the
"inherent structures" (ISs) of the system [3]. The investigation of the ISs has led to the
identification of the structural-arrest temperature in glasses [4] and supercooled liquids [5].
More recently, this kind of analysis has been extended to the study of proteins [6, 7].

Mechanical unfolding of single biomolecules represents a powerful technique to extract
information on their internal structure as well as on their unfolding and refolding pathways [8].
However, mechanical unfolding of biomolecules is an out-of-equilibrium process: unfolding events
occur on time scales much shorter than the typical relaxation time of the molecule towards
equilibrium. Nonetheless, by using the equality introduced by Jarzynski [9], the free energy of
mechanically manipulated biomolecules can be recovered as a function of an externally controlled
parameter [10].

In this Letter, we reconstruct the equilibrium free energy landscape (FEL) associated to a
mesoscopic off-lattice protein model as a function of an internal coordinate of the system
(namely, the end-to-end distance $\zeta$). At variance with previous studies [11, 12, 13], here we
exploit two independent methods: one based on an extended version of the Jarzynski equality (EJE)
and the other on thermodynamical averages over ISs. Moreover, the agreement of the results
obtained with the two approaches indicates that these two methodologies can be fruitfully
integrated to provide complementary information on the protein landscape. In particular the
investigation of the ISs allows us to give an estimate of the (free) energetic and entropic
barriers separating the native state from the completely stretched configuration.

The model studied in this paper is a modified version of the 3d off-lattice model introduced in Ref. [14]
and successively generalized to include a harmonic interaction between next-neighbouring beads
instead of rigid bonds [15, 16]. The model consists of a chain of 46 point-like monomers mimicking
the residues of a polypeptidic chain, where each residue is of one of three types: hydrophobic
(B), polar (P) and neutral (N).

The residues within the protein interact via an off-lattice coarse-grained potential composed of
four terms: a stiff nearest-neighbour harmonic potential intended to maintain the bond distance
almost constant; a three-body bending interaction associated to the bond angles; a four-body
interaction mimicking the torsion effects; and a long-range Lennard-Jones potential reproducing in
an effective way the solvent mediated interactions between pairs of residues non covalently
bonded [17]. The 46-mer sequence $B_9N_3(PB)_4N_3B_9N_3(PB)_5P$, which exhibits a four stranded
$\beta$-barrel Native Configuration (NC), is here analyzed with the same potential and parameter
set reported in Ref. [3], but we neglect any diversity among the hydrophobic residues. This
sequence has been previously studied, for different choices of the potential parameters, in the
context of spontaneous folding [14, 15, 16, 18, 19] as well as of mechanical unfolding and
refolding [20, 21]. The NC is stabilized by the attractive hydrophobic interactions among the B
residues; in particular the first and third $B_9$ strands, forming the core of the NC, are
parallel to each other and anti-parallel to the second and fourth strands, namely $(PB)_4$ and
$(PB)_5P$. The latter strands are exposed towards the exterior due to the presence of polar
residues.
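
As a quick consistency check on the sequence notation, the compact formula can be expanded residue by residue. The sketch below is illustrative only: the names `BLOCKS` and `expand` are not from the original work, and the block list simply transcribes $B_9N_3(PB)_4N_3B_9N_3(PB)_5P$.

```python
from collections import Counter

# (motif, repetitions) pairs transcribing B9 N3 (PB)4 N3 B9 N3 (PB)5 P.
# BLOCKS and expand are illustrative names, not taken from the paper.
BLOCKS = [("B", 9), ("N", 3), ("PB", 4), ("N", 3),
          ("B", 9), ("N", 3), ("PB", 5), ("P", 1)]


def expand(blocks):
    """Return the one-letter-per-residue string for a list of (motif, count) pairs."""
    return "".join(motif * count for motif, count in blocks)


sequence = expand(BLOCKS)
composition = Counter(sequence)

if __name__ == "__main__":
    print(sequence)
    print(len(sequence), dict(composition))
```

The expanded chain has 46 residues: 27 hydrophobic (B), 10 polar (P) and 9 neutral (N).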

The main thermodynamic features can be summarized with reference to three different transition
temperatures: the $\theta$-temperature $T_\theta$, discriminating between phases dominated by
random-coil configurations rather than collapsed ones; the folding temperature $T_f$, below which
the protein stays predominantly in the native valley; and the glassy temperature $T_g$,
indicating the freezing of large conformational rearrangements [7]. Following previously reported
procedures, we have determined these temperatures and obtained $T_\theta = 0.65(1)$,
$T_f = 0.28(1)$, and $T_g = 0.12(2)$.

In order to mimic the mechanical pulling of the protein attached to an AFM cantilever, or trapped
in optical tweezers, one extremum of the chain was kept fixed, and the last bead was attached to a
pulling device with a spring of elastic constant $k$. The external force is applied at time
$t = 0$ by moving the device along a fixed direction with a constant velocity protocol
$z(t) = z(0) + v_p t$. The protein is initially rotated to have the first and last bead aligned
along the pulling direction, therefore the external potential reads
$U_{z(t)}(\zeta) = k\,(z(t) - \zeta)^2/2$. Moreover, to reproduce the experimental conditions, the
thermalization procedure consists of two steps: a first stage when the protein evolves freely
starting from the NC, followed by a second one in presence of the pulling apparatus. The resulting
configuration is then used as the starting state at $t = 0$ for the forced unfolding performed at
constant temperature via a low friction Langevin dynamics [26].
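
The pulling protocol described above can be written down in a few lines. This is a minimal sketch of the moving harmonic trap only, not of the Langevin integrator; the function names are hypothetical, and the numerical values in the usage lines are arbitrary placeholders rather than the parameters used in the simulations.

```python
def device_position(t, z0, v_p):
    """Position of the pulling device, z(t) = z(0) + v_p * t."""
    return z0 + v_p * t


def external_potential(zeta, t, z0, v_p, k):
    """Harmonic coupling U_{z(t)}(zeta) = k * (z(t) - zeta)**2 / 2."""
    return 0.5 * k * (device_position(t, z0, v_p) - zeta) ** 2


def spring_force(zeta, t, z0, v_p, k):
    """Force on the last bead along the pulling direction, -dU/dzeta."""
    return k * (device_position(t, z0, v_p) - zeta)


if __name__ == "__main__":
    # Placeholder values chosen only to exercise the functions.
    z0, v_p, k = 1.9, 0.01, 10.0
    for t in (0.0, 10.0, 100.0):
        print(t, device_position(t, z0, v_p), spring_force(1.9, t, z0, v_p, k))
```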

Following Ref. [11], we briefly review how to reconstruct the equilibrium FEL as a function of the
collective coordinate $\zeta$ starting from out-of-equilibrium measurements. Let the system
(unperturbed) Hamiltonian $H_0(x)$ be a function of the positions and momenta of the residues
$x = \{r_i, p_i\}$; the free energy of the constrained ensemble, characterized by a given value
$\zeta$ of the macroscopic observable $\zeta(x)$, reads
$\beta f(\zeta) = -\ln \int dx\, \delta(\zeta - \zeta(x))\, e^{-\beta H_0(x)}$. The system is
driven out-of-equilibrium by the external potential $U_{z(t)}(\zeta)$, and the work done on the
system by the external force associated to $U_{z(t)}(\zeta)$ is
$W_t = \int_0^t d\tau\, v_p\, k\, (z(\tau) - \zeta(x(\tau)))$. Due to thermal fluctuations the
trajectory $x(t)$ followed by the system, and therefore $W_t$, varies between one realization of
the manipulation process and the other. In Ref. [23] an extended version of the Jarzynski equality
relates $f(\zeta)$ to the work done on the system, for arbitrary external potential. Such a
relation reads

$\langle \delta(\zeta - \zeta(x))\, e^{-\beta W_t} \rangle_t = e^{-\beta\left(f(\zeta) + U_{z(t)}(\zeta)\right)}/Z_0$, (1)

where $Z_0 = \int dx\, \exp[-\beta H_0(x)]$ and the average is performed over different
trajectories with fixed time-length $t$. Technical details for the optimal sampling of the lhs of
eq. (1) are discussed elsewhere.
As shown in fig. 1, the estimated FEL collapses into an asymptotic curve as the pulling velocity
decreases, in agreement with previously reported results. Let us now discuss, by referring to
fig. 1, the structural transitions (STs) induced by the pulling. As shown in the inset, the
asymptotic $f(\zeta)$ profile exhibits a clear minimum in correspondence of the end-to-end
distance of the NC (namely, $\zeta_0 \simeq 1.9$). Moreover, up to $\zeta \simeq 6$, the protein
remains in native-like configurations characterized by a $\beta$-barrel made up of 4 strands,
while the escape from the native valley is signaled by the small dip at $\zeta \simeq 6$,
indicated in the inset of fig. 1 as ST1. This ST has been recently analyzed in [21] in terms of
the potential energy of ISs. For higher $\zeta$ the configurations are characterized by an almost
intact core (made of 3 strands) plus a stretched tail corresponding to the pulled fourth strand
(see (b) and (c) in fig. 2). The second ST amounts to pulling the strand $(PB)_5P$ out of the
barrel. In order to do this, it is necessary to break 22 hydrophobic links [25], amounting to an
energy cost $\simeq 21$. The corresponding free energy barrier height is instead considerably
lower ($\simeq 11$, as estimated from fig. 1). Since the potential energy barrier is essentially
due to the hydrophobic interactions, this implies that a non negligible entropic cost is
associated to ST2. Instead, in the range $13 < \zeta < 18.5$ the curve $f(\zeta)$ appears
essentially flat, thus indicating that almost no work is needed to completely stretch the tail
once detached from the barrel. The pulling of the third strand (that is part of the core of the
NC) leads to a definitive destabilization of the $\beta$-barrel and to the breakdown of the
remaining 36 BB-links with an energetic cost $\simeq 35$. A finite entropic barrier should be
associated also to this final stage of the unfolding (termed ST3), because the energy increase due
to the hydrophobic terms is much higher than the free energy barrier ($\simeq 26$, see ST3 in
fig. 1). The second plateau in $f(\zeta)$ corresponds to protein structures made up of a single
strand (similar to (d) in fig. 2). The final quadratic rise of $f(\zeta)$ for $\zeta > 36$ is
associated to the stretching of bond angles and distances beyond their equilibrium values.

FIG. 1: (Color online) Free energy profile $f$ as a function of the end-to-end distance $\zeta$,
obtained by eq. (1) for various pulling velocities: from top to bottom $v_p$ = 5 x 10~^,
1 x 10"'^, 5 x 10~^, 5 x 10"'' and 2 x 10"''. In the inset, an enlargement of the curve for
$v_p$ = 5 x 10~^ at low $\zeta$ is reported. Each curve has been obtained by averaging over
160 - 240 repetitions of the same pulling protocol at $T = 0.3$. The letters indicate the value of
$f(\zeta)$ for the configurations reported in fig. 2 and the (blue) dashed lines the location of
the STs.
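
To make the use of eq. (1) concrete, here is a minimal single-time-slice sketch of the estimator: trajectories are binned by their end-to-end distance at time $t$, each one is weighted by $e^{-\beta W_t}$, and the bias $U_{z(t)}(\zeta)$ is subtracted. It is an illustration under stated assumptions, not the optimized sampling scheme referred to in the text: the function names, the left-Riemann discretisation of the work, the units with $k_B = 1$ and the restriction to a single time slice are all choices made for this example, and $f(\zeta)$ is recovered only up to an additive constant.

```python
import math


def accumulated_work(zetas, dt, z0, v_p, k):
    """Left-Riemann sum for W_t = int_0^t v_p * k * (z(tau) - zeta(tau)) dtau.

    zetas[n] is the end-to-end distance sampled at time n * dt.
    """
    work = 0.0
    for n, zeta in enumerate(zetas):
        z = z0 + v_p * n * dt
        work += v_p * k * (z - zeta) * dt
    return work


def eje_free_energy(final_zetas, works, z_t, k, temperature, bin_width):
    """Estimate f(zeta) from eq. (1) at one time slice, up to a constant.

    final_zetas[i] and works[i] are the end-to-end distance and the work of
    trajectory i at time t; z_t is the device position z(t).
    Returns a dict {bin centre: f}.
    """
    beta = 1.0 / temperature  # assumes k_B = 1
    w_min = min(works)        # shifting the works only changes the constant
    weights = {}
    for zeta, w in zip(final_zetas, works):
        b = math.floor(zeta / bin_width)
        weights[b] = weights.get(b, 0.0) + math.exp(-beta * (w - w_min))
    n_traj = len(works)
    profile = {}
    for b, weight in weights.items():
        centre = (b + 0.5) * bin_width
        bias = 0.5 * k * (z_t - centre) ** 2
        density = weight / (n_traj * bin_width)
        profile[centre] = -temperature * math.log(density) - bias
    return profile
```

In practice many time slices would be combined to cover the whole range of $\zeta$; the single-slice version only populates the bins visited at the chosen time.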

As shown in fig. 3, the FEL is strongly affected by temperature variations. In particular, for
temperatures around $T_f$ one still observes a clear minimum around $\zeta_0$ and a FEL resembling
the one found for $T = 0.3$. A native-like minimum is still observable for $T = 0.5 < T_\theta$;
however, its position $\zeta > \zeta_0$ indicates that the NC is no longer the most favoured
configuration. Furthermore the dip around $\zeta \simeq 6 - 7$ disappears and the heights of the
two other barriers reduce. By approaching $T_\theta$ the minimum broadens noticeably and the first
barrier almost disappears, thus suggesting that 4 stranded $\beta$-barrel configurations coexist
with partially unfolded ones. Above $T_\theta$ only one barrier remains and the absolute minimum
is now associated to extended conformations similar to type (b) or (c) with some residual barrel
structure.

FIG. 2: Pulled configurations at $T = 0.3$: the NC (a) has $\zeta_0 \simeq 1.9$; the others are
characterized by $\zeta = 6.8$ (b), $\zeta = 16.8$ (c), and $\zeta = 27.1$ (d).

FIG. 3: (Color online) Free energy profile $f(\zeta)$ as obtained by eq. (1) for various
temperatures: $T = 0.2$ (magenta stars), 0.4 (blue plus), 0.5 (red squares), 0.6 (green triangles)
and 0.7 (orange circles). In the inset an enlargement is reported at small $\zeta$. The data refer
to $v_p$ = 5 x 10~^.

Let us now introduce the reconstruction of the free energy in terms of the inherent states (ISs).
ISs correspond to local minima of the potential energy; in particular, the phase space visited by
the protein during its dynamical evolution can be decomposed in disjoint attraction basins, each
corresponding to a specific IS. In this context, the free energy can be expressed as a sum over
the basins of attraction:

$e^{-\beta f_{IS}(T)} = \sum_a e^{-\beta\left[V_a + R_a(T)\right]}$, (2)

where $a$ labels distinct ISs and $V_a$ (resp. $R_a$) is the corresponding potential (resp.
vibrational free) energy. $R_a$ represents an entropic contribution due to the fluctuations around
the considered minimum and is analytically estimated by assuming a harmonic basin of attraction in
terms of the $3N - 6$ non zero frequencies $\{\omega_a^j\}$ of the vibrational modes [1]. The
harmonic approximation works reasonably well up to $T \simeq T_\theta$, as we have verified by a
direct evaluation of the occupation probabilities of the various basins. We have built up two data
banks of ISs: the thermal data bank (TDB), obtained by performing equilibrium canonical
simulations, and the pulling data bank (PDB), obtained by mechanically unfolding the protein [24].
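
The sum over basins and the harmonic estimate of the vibrational term can be illustrated with a short numerical sketch. It assumes the classical harmonic-oscillator expression $R_a = T \sum_j \ln(\omega_a^j / T)$ in units where $\hbar = k_B = 1$; this explicit form is an assumption of the example and is not spelled out in the text above. The function names and the input layout (a list of `(V_a, frequencies)` pairs) are likewise hypothetical.

```python
import math


def vibrational_free_energy(frequencies, temperature):
    """Classical harmonic estimate R_a = T * sum_j ln(omega_j / T).

    Assumes hbar = k_B = 1; frequencies holds the non zero mode frequencies.
    """
    return temperature * sum(math.log(w / temperature) for w in frequencies)


def is_free_energy(basins, temperature):
    """f_IS = -T * ln sum_a exp(-(V_a + R_a) / T), evaluated with log-sum-exp.

    basins is a list of (V_a, frequencies_a) pairs, one per inherent structure.
    """
    exponents = [
        -(v + vibrational_free_energy(freqs, temperature)) / temperature
        for v, freqs in basins
    ]
    largest = max(exponents)  # factor out the dominant term for stability
    total = sum(math.exp(e - largest) for e in exponents)
    return -temperature * (largest + math.log(total))
```

Subtracting the largest exponent before exponentiating keeps the sum finite even when the basin energies are large compared with the temperature.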

In order to estimate the FEL $f_{IS}(\zeta)$ as a function of the variable $\zeta$ characterizing
different ISs, the sum in eq. (2) should be restricted to ISs with an end-to-end distance within a
narrow interval $[\zeta; \zeta + d\zeta]$.
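
The restriction of the basin sum to a narrow window of end-to-end distances amounts to a histogram over ISs. The following sketch assumes each IS is summarized by a hypothetical tuple `(zeta_a, V_a, R_a)`, with the vibrational term already evaluated at the temperature of interest, and uses fixed-width bins $[\zeta; \zeta + d\zeta]$. Any normalization by the bin width is omitted, which only shifts the profile by a constant.

```python
import math
from collections import defaultdict


def binned_is_free_energy(minima, temperature, d_zeta):
    """Restrict the basin sum to ISs whose zeta falls in [zeta, zeta + d_zeta).

    minima is an iterable of (zeta_a, V_a, R_a) tuples.
    Returns a dict {left edge of the bin: f_IS for that bin}.
    """
    exponents = defaultdict(list)
    for zeta, v, r in minima:
        left_edge = math.floor(zeta / d_zeta) * d_zeta
        exponents[left_edge].append(-(v + r) / temperature)
    profile = {}
    for left_edge, exps in exponents.items():
        largest = max(exps)  # log-sum-exp for numerical stability
        total = sum(math.exp(e - largest) for e in exps)
        profile[left_edge] = -temperature * (largest + math.log(total))
    return profile
```

Bins that contain no IS are simply absent from the returned dictionary, so gaps in the data bank show up as missing points rather than as spurious values.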

As shown in fig. 4, the comparison between $f_{IS}(\zeta)$ and the $f(\zeta)$ obtained via the EJE
reconstruction in proximity of $T_f$ reveals an almost complete coincidence up to
$\zeta \simeq 5$, while for larger $\zeta$, $f_{IS}(\zeta)$ slightly underestimates the free
energy. This disagreement is mainly due to the fact that the IS analysis is based only on minima
of the potential, while saddles are completely neglected. The further comparison between the IS
reconstructions obtained via the TDB and the PDB clearly indicates that the out-of-equilibrium
process consisting in stretching the protein is more efficient to investigate the FEL, since a
much smaller number of ISs is needed to reconstruct it well (at least up to $\zeta \simeq 17$).

The last stage of the unfolding reveals a difference between the two $f_{IS}$: the TDB FEL is
steeper than the PDB one, thus suggesting that the protein can reach lower energy states with
large $\zeta$ during mechanical unfolding, states that have a low probability to be visited during
the dynamics at thermal equilibrium. However the value of the barrier to overcome and that of the
final plateau are essentially the same. The IS conformation with the maximal end-to-end distance
is the all trans-configuration [27], corresponding to $\zeta_{trans} = 35.70$; therefore the IS
approach does not allow one to evaluate the FEL for $\zeta > \zeta_{trans}$. However, the IS
analysis provides us an estimate of the profiles of the potential and vibrational free energies
$V_{IS}(\zeta)$ and $R_{IS}(\zeta)$, respectively. From the latter quantity, the entropic costs
associated to the unfolding stages can be estimated. As shown in the inset of fig. 4, for
$T = 0.3$ the unfolding stages previously described correspond to clear "entropic" barriers. In
particular, in order to stretch the protein from the NC to the all trans-configuration the
decrease of $R_{IS}(\zeta)$ is $\simeq 19$, in agreement with the previous estimate obtained by
considering the EJE reconstruction of the FEL.
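
The agreement quoted above can be checked with simple arithmetic on the barrier values given earlier for $T = 0.3$. The sketch assumes that each free energy barrier splits additively into a potential part and a vibrational part, $\Delta f \simeq \Delta V + \Delta R$, which is an approximation for profiles obtained from sums over many basins; the dictionary and function names are illustrative.

```python
# (potential-energy cost, free-energy barrier) quoted in the text for T = 0.3.
STAGES = {"ST2": (21.0, 11.0), "ST3": (35.0, 26.0)}


def entropic_contribution(energy_cost, free_energy_barrier):
    """Delta R = Delta f - Delta V, assuming f = V + R along the profile."""
    return free_energy_barrier - energy_cost


contributions = {
    name: entropic_contribution(energy, barrier)
    for name, (energy, barrier) in STAGES.items()
}
total = sum(contributions.values())

if __name__ == "__main__":
    print(contributions, total)
```

The two contributions, about -10 for ST2 and -9 for ST3, add up to a decrease of about 19.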







FIG. 4: (Color online) Free energy profiles $f(\zeta)$ and $f_{IS}(\zeta)$ as a function of the
elongation $\zeta$ for $T = 0.3$. The black solid line refers to the reconstruction in terms of
the EJE, while the red dashed one corresponds to $f_{IS}$ for a set of pulling experiments with
$v_p$ = 2 x 10"*. The blue dot-dashed line is the $f_{IS}(\zeta)$ obtained in terms of the ISs of
the TDB. In the insets are reported the reconstructed $V_{IS}(\zeta)$ (lower panel) and
$R_{IS}(\zeta)$ (upper panel) by employing ISs in the PDB.

Finally, one can try to put in correspondence the three unfolding stages previously discussed with
thermodynamical aspects of the protein folding. In particular, by considering the energy profile
$V_{IS}(\zeta)$, an energy barrier $\Delta V_{IS}$ and a typical transition temperature
$T_t = (2\Delta V_{IS})/(3N)$ can be associated to each of the STs. The first transition ST1
corresponds to a barrier $\Delta V_{IS} = 8$ and therefore to $T_t \simeq 0.12$, which, within
error bars, coincides with $T_g$. For the ST2 transition the barrier to overcome is
$\Delta V_{IS} = 16$, and this is associated to a temperature $T_t \simeq 0.23$ (slightly below
$T_f$). The energetic cost to completely stretch the protein is $\simeq 49.7$, with a transition
temperature $T_t \simeq 0.72$ that is not too far from the $\theta$-temperature. At least for this
specific model, our results indicate that the observed STs induced by pulling can be put in direct
relationship with the thermal transitions usually identified for the folding/unfolding process.
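
The quoted transition temperatures follow directly from $T_t = (2\Delta V_{IS})/(3N)$ with $N = 46$. The snippet below simply evaluates this formula for the three barriers mentioned in the text; the variable names are illustrative.

```python
N_RESIDUES = 46  # chain length of the model protein


def transition_temperature(barrier, n_residues=N_RESIDUES):
    """T_t = 2 * Delta V_IS / (3 * N)."""
    return 2.0 * barrier / (3.0 * n_residues)


# Energy barriers quoted in the text for the three unfolding stages.
BARRIERS = {"ST1": 8.0, "ST2": 16.0, "full stretching": 49.7}

temperatures = {
    name: round(transition_temperature(barrier), 2)
    for name, barrier in BARRIERS.items()
}

if __name__ == "__main__":
    print(temperatures)
    # {'ST1': 0.12, 'ST2': 0.23, 'full stretching': 0.72}
```

These values can be compared with $T_g = 0.12(2)$, $T_f = 0.28(1)$ and $T_\theta = 0.65(1)$ reported earlier.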

We conclude by noting that the information obtained from the equilibrium FEL with the EJE and the
IS methodologies is consistent and gives substantiated hints about the thermal unfolding. However,
we want to point out that these two methods are to some extent complementary. On the one hand,
with the EJE approach all the coordinates are projected onto a collective one, the contribution of
the microscopic configurations being averaged out. On the other hand, the IS analysis appears more
suitable to study the microscopic details of the configuration space of complex systems such as
proteins, once the main basins have been identified by using the former approach.



Useful discussions with the members of the CSDC in 
Firenze and L. Peliti are acknowledged, as well as partial 
support by the European Contract No. 12835 - EMBIO. 



* Electronic address: alberto.imparato@polito.it
[1] D.J. Wales, Energy Landscapes, Cambridge University Press, Cambridge, 2003.
[2] P.E. Leopold et al., Proc. Natl. Acad. Sci. USA 89 (1992) 8721.
[3] F.H. Stillinger and T.A. Weber, Science 225 (1984) 983.
[4] S. Sastry et al., Nature 393 (1998) 554.
[5] L. Angelani et al., Phys. Rev. Lett. 85 (2000) 5356.
[6] A. Baumketner et al., Phys. Rev. E 67 (2003) 011912.
[7] N. Nakagawa and M. Peyrard, Proc. Natl. Acad. Sci. USA 103 (2006) 5279; Phys. Rev. E 74 (2006) 041916.
[8] M. Carrion-Vazquez et al., Proc. Natl. Acad. Sci. USA 96 (1999) 3694; B. Onoa et al., Science 299 (2003) 1892.
[9] C. Jarzynski, Phys. Rev. Lett. 78 (1997) 2690; G.E. Crooks, Phys. Rev. E 60 (1999) 2721.
[10] D. Collin et al., Nature 437 (2005) 231.
[11] A. Imparato and L. Peliti, J. Stat. Mech. (2006) P03005.
[12] O. Braun et al., Phys. Rev. Lett. 93 (2004) 158105.
[13] A. Imparato et al., Phys. Rev. Lett. 98 (2007) 148102.
[14] J.D. Honeycutt and D. Thirumalai, Proc. Natl. Acad. Sci. USA 87 (1990) 3526.
[15] T. Veitshans et al., Folding & Design 2 (1997) 1.
[16] R.S. Berry et al., Proc. Natl. Acad. Sci. USA 94 (1997) 9520.

[17] All quantities are here expressed in adimensional units; for a comparison with physical units see [15].

[18] Z. Guo and C.L. Brooks III, Biopolymers 42 (1997) 745.
[19] J. Kim et al., Phys. Rev. Lett. 97 (2006) 050601.
[20] F.-Y. Li et al., Phys. Rev. E 63 (2001) 021905.
[21] D.J. Lacks, Biophys. J. 88 (2005) 3494.
[22] A. Torcini et al., J. Biol. Phys. 27 (2001) 181; L. Bongini et al., Phys. Rev. E 68 (2003) 061111.
[23] G. Hummer and A. Szabo, Proc. Natl. Acad. Sci. USA 98 (2001) 3658.
[24] In order to find the different ISs the equilibrium (resp. out-of-equilibrium) Langevin
trajectory is sampled at constant time intervals $\delta t = 5$ (resp. at constant elongation
increments $\delta\zeta = 0.1$) to pinpoint a series of configurations, which afterwards are
relaxed via a steepest descent dynamics. For mechanical unfolding, the protein is unblocked and
the pulling apparatus removed before the relaxation stage. The TDB contains $\simeq 600,000$
distinct ISs collected via equilibrium simulations at various temperatures in the range
[0.3; 2.0]. The PDB contains 5,000 - 50,000 ISs for each examined temperature.
[25] We assume that a hydrophobic contact is formed between two B-residues if their distance is smaller than 1.25.
[26] S. Luccioli et al., unpublished.
[27] This is the elongated equilibrium conformation of the protein with all the dihedral angles at their trans values.



